# Large-scale corpus training

Tucano 2b4
Apache-2.0
Tucano-2b4 is a large language model natively pre-trained for Portuguese. It is based on the Transformer architecture and was trained on the 200-billion-token GigaVerbo dataset.
Large Language Model, Transformers, Other
TucanoBR
1,478
4
Roberta Base Turkish Uncased
MIT
This is a RoBERTa base model for Turkish. Its pre-training data comes from Turkish Wikipedia, the Turkish OSCAR corpus, and several news websites.
Large Language Model, Transformers
TURKCELL
109
7